Abstract

This paper proposes formulations and algorithms for design optimization under both 
aleatory (i.e., natural or physical variability) and epistemic uncertainty (i.e., imprecise 
probabilistic information), from the perspective of system robustness. The proposed 
formulations deal with epistemic uncertainty arising from both sparse and interval data 
without any assumption about the probability distributions of the random variables. A 
decoupled approach is proposed in this paper to un-nest the robustness-based design from 
the analysis of non-design epistemic variables to achieve computational efficiency. The 
proposed methods are illustrated for the upper stage design problem of a two-stage-to-orbit 
(TSTO) vehicle, where the information on the random design inputs is only 
available as sparse point and/or interval data. As collecting more data reduces uncertainty 
but increases cost, the effect of sample size on the optimality and robustness of the 
solution is also studied. A method is developed to determine the optimal sample size for 
sparse point data that leads to the solutions of the design problem that are least sensitive 
to variations in the input random variables. 


1. Introduction 

In deterministic design optimization, it is generally assumed that all design variables 
and system variables are precisely known; the influence of natural variability and data 
uncertainty on the optimality and feasibility of the design is not explicitly considered. 
However, real-life engineering problems are non-deterministic, and a deterministic 
assumption about inputs may lead to infeasibility or poor performance (Sim, 2006). In 
recent years, many methods have been developed for design under uncertainty. 
Reliability-based design (e.g., Chiralaksanakul and Mahadevan, 2005; Ramu et al, 2006; 
Agarwal et al, 2007; and Du and Huang, 2007) and robust design (e.g., Parkinson et al, 
1993; Du and Chen, 2000; Doltsinis and Kang, 2004; and Huang and Du, 2007) are two 
directions pursued by these methods. While reliability-based design aims to maintain 
design feasibility at desired reliability levels, robust design optimization attempts to 
minimize variability in the system performance due to variations in the inputs (Lee et al, 
2008). In recent years, several methods have also been proposed to integrate these two 
paradigms of design under uncertainty (e.g., Du et al, 2004, Lee et al, 2008). 

Taguchi proposed robust design methods for selecting design variables in a manner 
that makes the product performance insensitive to variations in the manufacturing process 
(Taguchi, 1993). Taguchi’s methods have widespread applications in engineering; 
however, these methods are implemented through statistical design of experiments and 
cannot solve problems with multiple measures of performances and design constraints 
(Wei et al, 2009). With the introduction of nonlinear programming to robust design, it has 
become possible to achieve robustness in both performance and design constraints (Du 
and Chen, 2000). 

The essential elements of robust design optimization are: (1) maintaining robustness 
in the objective function (objective robustness); (2) maintaining robustness in the 
constraints (feasibility robustness); (3) estimating the mean and a measure of variation 
(variance) of the performance function; and (4) multi-objective optimization. The rest of 
this section briefly reviews the literature with respect to these four elements and 
establishes the motivation for the current study. 

Objective robustness 

In robust optimization, the robustness of the objective function is usually achieved by 
simultaneously optimizing its mean and minimizing its variance. Two major robustness 
measures are available in the literature: one is the variance, which is extensively 
discussed in the literature (Du and Chen, 2000; Lee and Park, 2001 and Doltsinis and 
Kang, 2004) and the other is based on the percentile difference (Du et al, 2004). Although 
the percentile difference method has the advantage that it contains the information of 
probability in the tail regions of the performance distribution, this method is only 
applicable to unimodal distributions. Variance as a measure of variation of the 
performance function can be applied to any distribution (unimodal or multimodal), but it 
only characterizes the dispersion around the mean (Huang and Du, 2007). 

Feasibility robustness 

Feasibility robustness, i.e., robustness in the constraints, can be defined as satisfying 
the constraints of the design in the presence of uncertainty. Du and Chen (2000) 
classified the methods of maintaining feasibility robustness into two categories: methods 
that use probabilistic and statistical analysis, and methods that do not require them. 
Among the methods that require probabilistic and statistical analysis, a probabilistic 
feasibility formulation (Du and Chen, 2000 and Lee et al, 2008), and a moment matching 
formulation (Parkinson et al, 1993) have been proposed. Du and Chen (2000) used a most 
probable point (MPP)-based importance sampling method to reduce the computational 
burden associated with the probabilistic feasibility formulation. The moment matching 
formulation is a simplified approach which requires only the constraints on the first and 
second moments of the performance function to be satisfied, and assumes that the 
performance function is normally distributed. A variation of this approach, the feasible 
region reduction method, has been described in Park et al (2006); it is more general 
and does not require the normality assumption. This is a tolerance design method, where 
the width of the feasible space in each direction is reduced by the amount $k\sigma$, where $k$ is a 
user-defined constant and $\sigma$ is the standard deviation of the performance function. This 
method only requires the mean and variance of the performance function. 

Methods that do not require probabilistic and statistical analysis are also available, for 
example, worst case analysis (Parkinson et al, 1993), corner space evaluation 
(Sundaresan et al, 1995), and manufacturing variation patterns (MVP) (Yu and Ishii, 
1998). A comparison study of the different constraint feasibility methods can be found in 
Du and Chen (2000). 

Estimating mean and variance of the performance function 

Various methods have been reported in the literature to estimate the mean and 
standard deviation of the performance function. These methods can be divided into three 
major classes: (i) Taylor series expansion methods, (ii) sampling-based methods and (iii) 
point estimate methods (Huang and Du, 2007). 

The Taylor series expansion method (Haldar and Mahadevan, 2000; Du and Chen, 
2000; and Lee et al, 2001) is a simple approach. However, for a nonlinear performance 
function, if the variances of the random variables are large, this approximation may result 
in large errors (Du et al., 2004). Although a second-order Taylor series expansion is 
generally more accurate than the first-order approximation, it is also computationally 
more expensive. 

Sampling-based methods require information on distributions of the random 
variables, and are expensive. Efficient sampling techniques such as importance sampling, 
Latin hypercube sampling, etc. (Robert and Casella, 2004) can be used to reduce the 
computational effort, but they remain prohibitively expensive in the context of optimization. Surrogate 
models (Ghanem and Spanos 1991; Bichon et al, 2008; Cheng and Sandu, 2009) may be 
used to further reduce computational effort. 

In an attempt to overcome the difficulties associated with the computation of 
derivatives required in Taylor series expansion, Rosenblueth (1975) proposed a point 
estimate method to compute the first few moments of the performance function. Different 
variations of this point estimate method (Hong, 1998; Zhao and Ono, 2000 and Zhao and 
Ang, 2003) have been studied. Although point estimate methods are easier to implement, 
their accuracy may be low and they may generate points that lie outside the domain of the 
random variable. 

Multi-objective optimization 

Robustness-based optimization considers two objectives: optimize the mean of the 
objective function and minimize its variation. An extensive survey of the multi-objective 
optimization methods can be found in Marler and Arora (2004). Among the available 
methods, the weighted sum approach is the most common approach to multi-objective 
optimization and has been extensively used in robust design optimization (Lee and Park, 
2001; Doltsinis and Kang, 2004; Zou and Mahadevan, 2006). The designer can obtain 
alternative design points by varying the weights and can select the one that offers the best 
trade-off among multiple objectives. Despite its simplicity, the weighted sum method 
may not obtain potentially desirable solutions (Park et al, 2006). Another common 
approach is the ε-constraint method, in which one of the objective functions is optimized 
while the other objective functions are used as constraints. Despite its advantages over 
the weighted sum method in some cases, the ε-constraint method can be computationally 
expensive for more than two objective functions (Mavrotas, 2009). 
Other methods include goal programming (Zou and Mahadevan, 2006), compromise 
decision support problem (Bras and Mistree, 1993, 1995; Chen et al, 1996), compromise 
programming (CP) (Zeleny, 1973; Zhang, 2003; Chen et al, 1999) and physical 
programming (Messac, 1996; Messac et al, 2001; Messac and Ismail-Yahaya, 2002; Chen 
et al, 2000). Each of these methods has its own advantages and limitations. 

Although there is now an extensive volume of literature for robust optimization 
methods and applications, all these methods have only been studied with respect to 
physical or natural variability represented by probability distributions. Uncertainty in 
system design also arises from other contributing factors. Sources of uncertainty may be 
divided into two types: aleatory and epistemic (Oberkampf et al., 2004). Aleatory 
uncertainty is irreducible. Examples include phenomena that exhibit natural variation like 
operating conditions, material properties, geometric tolerances, etc. In contrast, epistemic 
uncertainty results from a lack of knowledge about the system, or due to approximations 
in the system behavior models, or due to limited or subjective (e.g., expert opinion) data; 
it can be reduced as more information about the system is obtained. 

One type of data uncertainty involves having limited data to properly define the 
distribution parameters of the random variables. This type of uncertainty may be reduced 
by collecting more data. In some cases of data uncertainty, distribution information of a 
random variable may only be available as intervals given by experts. The objective of this 
paper is to develop an efficient robust optimization methodology that includes both 
aleatory and epistemic uncertainty described through sparse point data and interval data. 

A few studies on robust design optimization are reported in the literature to deal with 
epistemic uncertainty arising from lack of information. Youn et al (2007) used a 
possibility-based method, and redefined the performance measure of robust design using 
the most likely values of fuzzy random variables. Dai and Mourelatos (2003) proposed 
two two-step methods for robust design optimization that can treat aleatory and epistemic 
uncertainty separately using a range method and a fuzzy sets approach. Most of the 
current methods of robust optimization for epistemic uncertainty need additional non- 
probabilistic formulations to incorporate epistemic uncertainty into the robust 
optimization framework, which may be computationally expensive. However, if the 
epistemic uncertainty can be converted to a probabilistic format, the need for these 
additional formulations is avoidable, and well-established probabilistic methods of robust 
design optimization can be used. Therefore, there is a need for an efficient robust design 
optimization methodology that deals with both aleatory and epistemic uncertainty. 

In this paper, we propose robustness-based design optimization formulations that 
work under both aleatory and epistemic uncertainty using probabilistic representations of 
different types of uncertainty. Our proposed formulations deal with both sparse point and 
interval data without any assumption about probability distributions of the random 
variables. 

The performance of robustness-based design can be defined by the mean and 
variation of the performance function. In our proposed formulations, we obtain the 
optimum mean value of the objective function (e.g., gross weight) while also minimizing 
its variation (e.g., standard deviation). Thus, the design will meet target values in terms of 
both design bounds and standard deviations of design objectives and design variables, 
thereby ensuring feasibility robustness. 

A Taylor series expansion method is used in this paper to estimate the mean and 
standard deviation of the performance function, which requires means and standard 
deviations of the random variables. However, with sparse point data and interval data, it 
is impossible to know the true moments of the data, and there are many possible 
probability distributions that can represent these data (Zaman et al, 2009). In this paper, 
we propose methods for robustness-based design optimization that account for this 
uncertainty in the moments due to sparse point data and interval data and thereby include 
epistemic uncertainty into the robust design optimization framework. As collecting more 
data reduces uncertainty but increases cost, the effect of sample size on the optimality 
and the robustness of the solution is also studied. A method to determine the optimal 
sample size for sparse point data that will lead to the minimum scatter on solutions to the 
design problem is also presented in this paper. 

In some existing methods for robust design under epistemic uncertainty, all the 
epistemic variables are considered as design variables (Youn et al, 2007). However, if the 
designer does not have any control on an epistemic variable (e.g., Young’s modulus in 
beam design), considering that variable as a design variable might lead to a solution that 
could underestimate the design objectives. Therefore, in this paper, we propose a general 
formulation for robust design that considers some of the epistemic variables as non- 
design variables, which leads to a conservative design under epistemic uncertainty. 

Note that the proposed robustness-based design optimization method is general 
and capable of handling a wide range of application problems under data uncertainty. The 
proposed methods are illustrated for the conceptual level design process of a two-stage-to-orbit 
(TSTO) vehicle, where the distributions of the random inputs are described by 
sparse point and/or interval data. 

The rest of the paper is organized as follows. Section 2 proposes a robustness-based 
design optimization framework for sparse point data and interval data. In Section 3, we 
illustrate the proposed methods for the conceptual level design process of a TSTO 
vehicle. Section 4 provides conclusions and suggestions for future work. 

2. Proposed methodology 

Deterministic design optimization 

In a deterministic optimization formulation, all design variables and system 
variables are considered deterministic. No random variability or data uncertainty is taken 
into account. The deterministic optimization problem is formulated as follows: 
\[
\begin{aligned}
\min_{x}\ & f(x) \\
\text{s.t.}\ & LB \le g_i(x) \le UB \quad \text{for all } i \\
& lb \le x \le ub
\end{aligned}
\tag{1}
\]

where $f(x)$ is the objective function, $x$ is the vector of design variables, $g_i(x)$ is the ith 
constraint, LB and UB are the vectors of lower and upper bounds of the constraints $g_i$, and 
lb and ub are the vectors of lower and upper bounds of the design variables. 

In practice, the input variables might be uncertain and solutions of this 
deterministic formulation could be sensitive to the variations in the input variables. 
Robustness-based design optimization takes this uncertainty into account. The optimal 
design points obtained using the deterministic method could be used as initial guesses in 
robustness-based optimization. 

Robustness-based design optimization 

In the proposed methodology, we use variance as a measure of variation of the 
performance function in order to achieve objective robustness, the feasible region 
reduction method to achieve feasibility robustness, a first-order Taylor series expansion 
to estimate the mean and variance of the performance function, and a weighted sum 
method for the aggregation of multiple objectives. This combination of methods is only 
used for the sake of illustration. Other approaches can be easily substituted in the 
proposed methodology. The robustness-based design optimization problem can now be 
formulated as follows: 

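\[
\begin{aligned}
\min_{d}\ & f(\mu, \sigma) = w\,\mu_f + v\,\sigma_f \\
\text{s.t.}\ & LB + k\,\sigma(g_i(d, z)) \le E(g_i(d, z)) \le UB - k\,\sigma(g_i(d, z)) \quad \text{for all } i \\
& lb_i + k\,\sigma(x_i) \le d_i \le ub_i - k\,\sigma(x_i) \quad \text{for } i = 1, 2, \ldots, nrdv \\
& lb_i \le d_i \le ub_i \quad \text{for } i = 1, 2, \ldots, nddv
\end{aligned}
\tag{2}
\]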

where $d$ is the vector of deterministic design variables as well as the mean values of the 
uncertain design variables $x$; nrdv and nddv are the numbers of the random design 
variables and deterministic design variables, respectively; and $z$ is the vector of non-design 
input random variables, whose values are kept fixed at their mean values as a part 
of the design. The weighting coefficients w > 0 and v > 0 represent the relative 
importance of the objectives $\mu_f$ and $\sigma_f$ in Eq. (2); $g_i(d, z)$ is the ith constraint; 
$E(g_i(d, z))$ is the mean and $\sigma(g_i(d, z))$ is the standard deviation of the ith constraint. LB 
and UB are the vectors of lower and upper bounds of the constraints $g_i$; lb and ub are the 
vectors of lower and upper bounds of the design variables; $\sigma(x)$ is the vector of standard 
deviations of the random variables and $k$ is some constant. The role of the constant $k$ is to 
adjust the robustness of the method against the level of conservatism of the solution. It 
reduces the feasible region by accounting for the variations in the design variables and is 
related to the probability of constraint satisfaction. For example, if a design variable or a 
constraint function is normally distributed, k = 1 corresponds to the probability 0.8413, k 
= 2 to the probability 0.9772, etc. 
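
The quoted probabilities are values of the standard normal cumulative distribution function evaluated at k. As a minimal check, assuming a one-sided normal bound (the function name `satisfaction_probability` is illustrative and not part of the paper):

```python
from statistics import NormalDist

def satisfaction_probability(k: float) -> float:
    """One-sided probability P(X <= mean + k*sigma) for a normal variable X."""
    return NormalDist().cdf(k)

for k in (1, 2, 3):
    # Prints 0.8413, 0.9772 and 0.9987 for k = 1, 2 and 3
    print(f"k = {k}: probability = {satisfaction_probability(k):.4f}")
```

A larger k shrinks the feasible region further and raises the probability of constraint satisfaction, at the price of a more conservative design.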

Note that the robust design formulation in Eq. (2) is a standard nonlinear multi- 
objective optimization formulation. The optimality conditions of such a formulation have 
been extensively described in the literature including Cagan and Williams (1993) and 
Marler and Arora (2004). 

In the proposed formulation, the performance functions considered are in terms of the 
model outputs. The means and standard deviations of the objective and constraints are 
estimated by using a first-order Taylor series approximation as follows: 

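Performance function:

\[
Y = g(X_1, X_2, \ldots, X_n)
\tag{3}
\]

First-order approximate mean of Y:

\[
E(Y') \approx g(\mu_{X_1}, \mu_{X_2}, \ldots, \mu_{X_n})
\tag{4}
\]

First-order variance of Y:

\[
Var(Y') \approx \sum_{i=1}^{n} \left( \frac{\partial g}{\partial X_i} \right)^2 Var(X_i)
+ \sum_{i=1}^{n} \sum_{\substack{j=1 \\ i \ne j}}^{n} \frac{\partial g}{\partial X_i} \frac{\partial g}{\partial X_j}\, Cov(X_i, X_j)
\tag{5}
\]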
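
As an illustration of Eqs. (4) and (5), the following sketch estimates the first-order mean and variance of a performance function. It assumes independent inputs, so the covariance terms vanish, and it approximates the partial derivatives by central finite differences. The function `first_order_moments` and the toy function `g` are hypothetical and only serve the example.

```python
def first_order_moments(g, means, variances, rel_step=1e-6):
    """First-order Taylor estimates of E[g(X)] and Var[g(X)] about the means.

    Assumes independent inputs, so the covariance terms of Eq. (5) are dropped.
    """
    mean_y = g(means)                      # Eq. (4): evaluate g at the input means
    var_y = 0.0
    for i, (mu, var) in enumerate(zip(means, variances)):
        h = rel_step * max(abs(mu), 1.0)
        up = list(means)
        dn = list(means)
        up[i] = mu + h
        dn[i] = mu - h
        grad_i = (g(up) - g(dn)) / (2.0 * h)   # central difference for dg/dX_i
        var_y += grad_i ** 2 * var             # first sum in Eq. (5)
    return mean_y, var_y

# Hypothetical performance function used only for this example
def g(x):
    return x[0] ** 2 + 3.0 * x[1]

mean_y, var_y = first_order_moments(g, [2.0, 1.0], [0.04, 0.01])
print(mean_y, var_y)
```

For this toy function the gradient at the means is (4, 3), so the variance estimate is 16(0.04) + 9(0.01) = 0.73 and the mean estimate is 7.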


The implementation of Eq. (2) requires that variances of the random design 
variables $X_i$ and the means and variances of the random non-design variables $Z_i$ be 
precisely known, which is possible only when a large number of data points are available. 
In practical situations, only a small number of data points may be available for the input 
variables. In other cases, information about random input variables may only be specified 
as intervals, as by expert opinion. This is input data uncertainty, causing uncertainty 
regarding the distribution parameters (e.g., mean and variance) of the inputs $X_i$ and $Z_i$. 
Robustness-based optimization has to take this into account. In the following subsections, 
we propose a new methodology for robustness-based design optimization that accounts 
for data uncertainty. 

2.1 Robustness-based design optimization under data uncertainty 

The inclusion of epistemic uncertainty in robust design adds another level of 
complexity in the design methodology. The design variables d and/or the input random 
variables z in Eq. (2) might have epistemic uncertainty. Since the designer does not have 
any control on the non-design epistemic variables z, the design methodology has to 
employ a search among the possible values of such epistemic variables in order to find an 
optimal solution. In such case, we get a conservative robust design. The robustness-based 
design optimization problem can now be formulated with the following generalized 
statement: 

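\[
\begin{aligned}
& \max_{\mu_z} \Big( \min_{d}\ f(\mu, \sigma) = w\,\mu_f + v\,\sigma_f \Big) \\
\text{s.t.}\ & LB + k\,\sigma(g_i(d, \mu_z)) \le E(g_i(d, z)) \le UB - k\,\sigma(g_i(d, \mu_z)) \quad \text{for all } i \\
& lb + k\,\sigma(x) \le d \le ub - k\,\sigma(x) \\
& Z_l \le \mu_z \le Z_u
\end{aligned}
\tag{6}
\]

where $Z_l$ and $Z_u$ are the vectors of lower and upper bounds of the decision variables $\mu_z$ of 
the outer loop optimization problem. 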

Note that in this formulation, the inner loop decision variables d may consist of stochastic 
design variables as well as epistemic design variables. The inner loop optimization is a 
design optimization problem, where a robust design optimization is carried out for a fixed 
set of non-design epistemic variables. The outer loop optimization is the analysis for the 
non-design epistemic variables, where the optimizer searches among the possible values 
of the non-design epistemic variables for a conservative solution of the robust design 
problem. 

This nested optimization problem can be decoupled and expressed as: 
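\[
\begin{aligned}
d^* = \arg\min_{d}\ & \big( w\,\mu_f(d, \mu_z^*) + v\,\sigma_f(d, \mu_z^*) \big) \\
\text{s.t.}\ & LB + k\,\sigma(g_i(d, \mu_z^*)) \le E(g_i(d, z)) \le UB - k\,\sigma(g_i(d, \mu_z^*)) \quad \text{for all } i \\
& lb + k\,\sigma(x) \le d \le ub - k\,\sigma(x)
\end{aligned}
\tag{7}
\]

\[
\begin{aligned}
\mu_z^* = \arg\max_{\mu_z}\ & \big( w\,\mu_f(d^*, \mu_z) + v\,\sigma_f(d^*, \mu_z) \big) \\
\text{s.t.}\ & LB + k\,\sigma(g_i(d^*, \mu_z)) \le E(g_i(d, z)) \le UB - k\,\sigma(g_i(d^*, \mu_z)) \quad \text{for all } i \\
& Z_l \le \mu_z \le Z_u
\end{aligned}
\tag{8}
\]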

The optimization problems in Eqs. (7) and (8) are solved iteratively until convergence. 
Note that the first constraint (i.e., the robustness constraint) in Eq. (8) is required to 
ensure that the optimization is driven by all non-design epistemic variables, because 
sometimes the objective function may not be a function of all non-design epistemic 
variables. In cases when the objective function is the function of all non-design epistemic 
variables, this constraint is not required. 
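
The alternating solution of Eqs. (7) and (8) can be sketched on a toy problem with one design variable and one non-design epistemic variable. Everything in this sketch is an assumption made for illustration: the performance function, the numerical values, the grid search used in place of a nonlinear programming solver, and all names. The performance constraints are omitted, and only the design bounds are shrunk by $k\sigma$.

```python
import math

W, V, K = 0.5, 0.5, 2.0          # assumed weights and feasibility constant k
SIGMA_D, SIGMA_Z = 0.1, 0.2      # assumed standard deviations of d and z
D_LB, D_UB = 0.0, 5.0            # assumed bounds on the design variable
Z_L, Z_U = 0.5, 1.5              # assumed bounds on the mean of z

def objective(d, mu_z):
    """w*mean + v*std of the toy performance f = (d - 2)^2 + z*d,
    with first-order Taylor estimates about (d, mu_z)."""
    mean_f = (d - 2.0) ** 2 + mu_z * d
    df_dd = 2.0 * (d - 2.0) + mu_z
    df_dz = d
    std_f = math.sqrt((df_dd * SIGMA_D) ** 2 + (df_dz * SIGMA_Z) ** 2)
    return W * mean_f + V * std_f

def grid(lo, hi, n=1000):
    return [lo + (hi - lo) * i / n for i in range(n + 1)]

def decoupled_robust_design(mu_z0, max_iter=50, tol=1e-9):
    # Feasible region reduction: shrink the design bounds by k*sigma
    d_grid = grid(D_LB + K * SIGMA_D, D_UB - K * SIGMA_D)
    z_grid = grid(Z_L, Z_U)
    mu_z, d = mu_z0, None
    for _ in range(max_iter):
        # Design step, in the spirit of Eq. (7): minimize over d for fixed mu_z
        d_new = min(d_grid, key=lambda dd: objective(dd, mu_z))
        # Analysis step, in the spirit of Eq. (8): maximize over mu_z for fixed d
        mu_new = max(z_grid, key=lambda mz: objective(d_new, mz))
        if d is not None and abs(d_new - d) < tol and abs(mu_new - mu_z) < tol:
            return d_new, mu_new
        d, mu_z = d_new, mu_new
    return d, mu_z

d_star, mu_z_star = decoupled_robust_design(mu_z0=1.0)
print(d_star, mu_z_star)
```

In this toy case the objective increases with $\mu_z$ for the designs visited, so the analysis step settles on the upper bound $Z_u$ and the design step then finds the robust optimum for that worst-case mean.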

2.1.1 Robustness-based design with sparse point data 

In this section, we propose a methodology for robustness-based design 
optimization with sparse point data. It is assumed that only sparse point data are available 
for the uncertain design variables as well as non-design epistemic variables. 

Since the data size is small, there is uncertainty about the mean and variance 
calculated from the samples. The chi-square distribution is a good assumption for the 
distribution of the variance, if the underlying population is normal. The two-sided $(1-\alpha)$ 
confidence interval for the population variance $\sigma^2$ can be expressed as (Haldar and 
Mahadevan, 2000): 

\[
\left[ \frac{(n-1)\,s^2}{c_{1-\alpha/2,\,n-1}},\ \frac{(n-1)\,s^2}{c_{\alpha/2,\,n-1}} \right]
\tag{9}
\]

where $n$ is the sample size, $s$ is the sample standard deviation of sparse point data, and 
$c_{\alpha/2,\,n-1}$ is obtained from the chi-square distribution at $(n-1)$ degrees of freedom and $\alpha$ 
significance level. Note that Eq. (9) can still be used to obtain approximate confidence 
bounds for variance if the underlying population is not normal. However, in such cases, 
other approximation methods (Bonett, 2006; Cojbasic and Tomovic, 2007) can be 
used to obtain more reliable estimates of confidence bounds. In robustness-based design 
optimization, we are interested in obtaining a solution that is least sensitive to the 
variations in the input random variables; therefore we use the upper bound variances for 
the input random variables $X_i$ and $Z_i$ to solve the formulations in Eqs. (7)-(8) for sparse 
point data. 
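
A minimal sketch of the interval in Eq. (9) follows. To stay within the Python standard library, it replaces exact chi-square quantiles with the Wilson-Hilferty cube-root approximation, which is a substitution made here and not part of the paper; in practice an exact quantile from a statistics package would be preferable. The sample values and function names are hypothetical.

```python
from statistics import NormalDist, variance

def chi2_quantile_wh(p, dof):
    """Approximate p-quantile of the chi-square distribution
    (Wilson-Hilferty cube-root approximation, not exact)."""
    z = NormalDist().inv_cdf(p)
    a = 2.0 / (9.0 * dof)
    return dof * (1.0 - a + z * a ** 0.5) ** 3

def variance_confidence_interval(data, alpha=0.05):
    """Two-sided (1 - alpha) confidence interval for the population variance."""
    n = len(data)
    s2 = variance(data)   # sample variance with n - 1 divisor
    lower = (n - 1) * s2 / chi2_quantile_wh(1.0 - alpha / 2.0, n - 1)
    upper = (n - 1) * s2 / chi2_quantile_wh(alpha / 2.0, n - 1)
    return lower, upper

samples = [9.8, 10.4, 10.1, 9.6, 10.9, 10.2, 9.9, 10.5, 10.0, 9.7]  # hypothetical data
var_lo, var_hi = variance_confidence_interval(samples)
print(var_lo, var_hi)
```

The upper value `var_hi` plays the role of the upper bound variance used in the robust formulation; it moves closer to the sample variance as the sample size grows.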

For non-design epistemic variables described by sparse point data, the constraints 
on the decision variables in Eq. (8) are implemented through the construction of 
confidence intervals about means. As the design variables are described by the sparse 
point data, it is possible that the underlying distributions of the design variables might 
have major deviations from normality. Therefore, we use Johnson's modified t 
statistic (Johnson, 1978) to construct the confidence bounds on means of the design 
variables as follows: 

\[
Z_l = \bar{z} - t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} + \frac{\mu_3}{6 s^2 n}, \qquad
Z_u = \bar{z} + t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} + \frac{\mu_3}{6 s^2 n}
\tag{10}
\]

where $\bar{z}$ is the vector of means of the epistemic variables, $s$ is the vector of standard 
deviations, $n$ is the sample size of the sparse point data, $\mu_3$ is the third central moment 
and $t_{\alpha/2,\,n-1}$ is obtained from the Student t distribution at $(n-1)$ degrees of freedom and $\alpha$ 
significance level. This modified statistic takes into account the skewness of the 
distribution and thus provides a better estimate of the confidence bound in the presence of 
limited data. 
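
The bounds of Eq. (10) can be sketched as below, assuming the usual form of Johnson's interval in which the skewness term shifts both bounds by the same amount. The Student t critical value is passed in by the caller (for example from standard tables), because the Python standard library has no t quantile function. The third central moment is estimated by a simple average of cubed deviations, which is one of several possible estimators. Function and variable names are illustrative.

```python
from statistics import mean, stdev

def johnson_mean_bounds(data, t_crit):
    """Confidence bounds on the mean based on Johnson's modified t statistic.

    t_crit is the Student t critical value t_{alpha/2, n-1}, supplied by the caller.
    """
    n = len(data)
    z_bar = mean(data)
    s = stdev(data)                                   # sample standard deviation
    mu3 = sum((x - z_bar) ** 3 for x in data) / n     # simple third central moment estimate
    shift = mu3 / (6.0 * s ** 2 * n)                  # skewness correction term
    half_width = t_crit * s / n ** 0.5
    return z_bar - half_width + shift, z_bar + half_width + shift

T_CRIT = 2.776  # t_{0.025, 4} from standard tables (n = 5)
sym_lo, sym_hi = johnson_mean_bounds([1.0, 2.0, 3.0, 4.0, 5.0], T_CRIT)
skew_lo, skew_hi = johnson_mean_bounds([1.0, 1.2, 1.5, 2.0, 6.0], T_CRIT)
print(sym_lo, sym_hi, skew_lo, skew_hi)
```

For symmetric data the correction vanishes and the ordinary t interval is recovered; for right-skewed data both bounds move upward.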

The optimization formulation shown in Eqs. (7)-(8) involves aggregation of 
multiple objectives. In the proposed formulations, the aggregate objective function 
consists of two types of objectives, expectation and standard deviation of model outputs. 
Since different objectives have different magnitudes, a scaling factor has to be used in the 
formulation. 

2.1.2 Determination of optimal sample size for sparse point data 

The optimal solutions depend on the sample size of the sparse data as will be 
discussed in Section 3.1. Therefore, it is of interest to determine the optimal sample size 
of the sparse data that leads to the solution of the design problem that is least sensitive to 
the variations of design variables. This will facilitate resource allocation decisions for 
data collection. The following two optimization formulations are solved iteratively until 
convergence for the optimal sample sizes of the epistemic design variables ($n_d$) and 
epistemic non-design variables ($n_e$). The formulations in Eqs. (11)-(12) are the weighted 
sum formulations of a three-objective optimization problem, where the first and second 
objectives are the mean and standard deviation of gross weight (GW), respectively, and the third 
objective is the total cost of obtaining samples for all the random variables. 


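\[
\begin{aligned}
[d^*, n_d^*] = \arg\min_{d,\,n_d}\ & \Big( w\,E(g_i(d, \mu_z^*)) + v\,\sigma(g_i(d, \mu_z^*), n_d, n_e)
+ (1 - w - v) \Big( \sum_{j=1}^{m} n_{d_j} c_{d_j} + \sum_{j=1}^{q} n_{e_j} c_{e_j} \Big) \Big) \\
\text{s.t.}\ & LB + k\,\sigma(g_i(d, \mu_z^*), n_d, n_e) \le E(g_i(d, z)) \le UB - k\,\sigma(g_i(d, \mu_z^*), n_d, n_e) \quad \text{for all } i \\
& lb + k\,\sigma(x, n_d) \le d \le ub - k\,\sigma(x, n_d) \\
& n_{d_j} \le b_{d_j} \quad \text{for all } j
\end{aligned}
\tag{11}
\]

\[
\begin{aligned}
[\mu_z^*, n_e^*] = \arg\max_{\mu_z,\,n_e}\ & \Big( w\,E(g_i(d^*, \mu_z)) + v\,\sigma(g_i(d^*, \mu_z), n_d, n_e)
+ (1 - w - v) \Big( \sum_{j=1}^{m} n_{d_j} c_{d_j} + \sum_{j=1}^{q} n_{e_j} c_{e_j} \Big) \Big) \\
\text{s.t.}\ & LB + k\,\sigma(g_i(d^*, \mu_z), n_d, n_e) \le E(g_i(d, z)) \le UB - k\,\sigma(g_i(d^*, \mu_z), n_d, n_e) \quad \text{for all } i \\
& Z_l(n_e) \le \mu_z \le Z_u(n_e) \\
& \sum_{j=1}^{m} n_{d_j} c_{d_j} + \sum_{j=1}^{q} n_{e_j} c_{e_j} \le C \\
& n_{e_j} \le b_{e_j} \quad \text{for all } j
\end{aligned}
\tag{12}
\]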


where w > 0 and v > 0 are the weighting coefficients that represent the relative 
importance of the objectives; $n_{d_j}$ and $n_{e_j}$ are the sample sizes and $b_{d_j}$ and $b_{e_j}$ are the 
maximum sample sizes possible for the jth design and non-design random variables, 
respectively. The numbers of design and non-design random variables are denoted by m 
and q, respectively. Terms $c_{d_j}$ and $c_{e_j}$ are the costs of obtaining one sample for the jth 
random design and non-design variables, respectively, and C is the total cost allocated for 
obtaining samples for all the random variables. Note that as in Eq. (8), the robustness 
constraint in Eq. (12) is only required if the objective function is not a function of all non-design 
epistemic variables. The optimization formulation presented above is a mixed-integer 
nonlinear problem. A relaxed problem is solved in Section 3. 

2.1.3 Robustness-based design with interval data 

This section proposes a methodology for robustness-based design optimization 
with interval data. In this case, the only information available for one or more input 
random variables is in the form of single interval or multiple interval data. The following 
discussion develops a methodology to solve the formulations in Eq. (7)-(8) for 
uncertainty represented through interval data. 

For interval data, the moments (e.g., mean and variance) are not a single value, 
rather only bounds can be given (Zaman et al, 2009). We have proposed methods to 
compute the bounds of moments for both single and multiple interval data in Zaman et al 
(2009). The methods for computing bounds of the first two moments for interval data are 
given later in this section. Once the bounds on the mean and variance of interval data are 
estimated, we use the upper bounds of sample variance to solve the formulations of 
robust design under uncertainty represented through single interval or multiple interval 
data. Therefore, the resulting solution becomes least sensitive to the variations in the 
uncertain variables. 

For non-design epistemic variables described by interval data, the constraints on 
the decision variables in Eqs. (8) and (12) are implemented through estimating the 
bounds of the means by the methods as described later in this section. 

The following discussions briefly summarize the methods to estimate the bounds 
on the first two moments for single interval and multiple interval data, respectively. 

Bounds on moments with single interval data 

The methods for calculating bounds on the first two moments for single interval 
data are summarized in Table 1 below. 


Table 1: Methods for calculating moment bounds for single interval data 

Moment   Condition for lower bound     Condition for upper bound     Formula
------   ---------------------------   ---------------------------   --------------------------
1        PMF = 1 at lower endpoint,    PMF = 1 at upper endpoint,    $M_1 = E(x)$
         = 0 elsewhere                 = 0 elsewhere
2        PMF = 1 at any point,         PMF = 0.5 at each endpoint    $M_2 = E(x^2) - (E(x))^2$
         = 0 elsewhere

Note: $E(x) = \sum_{i} x_i\,p(x_i)$ and $E(x^2) = \sum_{i} x_i^2\,p(x_i)$, 
where $p(x_i)$ = Probability Mass Function (PMF) 


In Table 1, the formulas lead to the lower and upper endpoints of the interval as 
the lower and upper bounds for the first moment, respectively. The formulas also imply 
that the lower bound for the second moment is zero. 
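
The conditions in Table 1 can be evaluated directly. The sketch below, with an illustrative function name, computes the moment bounds for a single interval [a, b] by placing the probability mass as the table prescribes.

```python
def single_interval_moment_bounds(a, b):
    """Bounds on the mean and variance for one interval [a, b], following Table 1."""
    mean_bounds = (a, b)                      # all probability mass at one endpoint
    # Upper variance bound: PMF = 0.5 at each endpoint
    e_x = 0.5 * a + 0.5 * b
    e_x2 = 0.5 * a ** 2 + 0.5 * b ** 2
    var_upper = e_x2 - e_x ** 2               # algebraically equal to (b - a)^2 / 4
    var_bounds = (0.0, var_upper)             # lower bound: all mass at a single point
    return mean_bounds, var_bounds

mean_bounds, var_bounds = single_interval_moment_bounds(2.0, 6.0)
print(mean_bounds, var_bounds)
```

For the interval [2, 6] the mean lies in [2, 6] and the variance in [0, 4].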


Bounds on moments with multiple interval data 

The methods for calculating bounds on the first two moments for multiple interval 
data are summarized in Table 2 below. 

Table 2: Methods for calculating moment bounds for multiple interval data 

Moment   Formula
------   ---------------------------------------------------------------
1        $\left[ \frac{1}{n} \sum_{i=1}^{n} lb_i,\ \frac{1}{n} \sum_{i=1}^{n} ub_i \right]$
2        $\min/\max\ M_2 = \frac{1}{n} \sum_{i=1}^{n} \left( x_i - \frac{1}{n} \sum_{j=1}^{n} x_j \right)^2$
         s.t. $lb_i \le x_i \le ub_i$, $i = \{1, \ldots, n\}$

Note: $[lb_i, ub_i]$ = Set of intervals; n = Number of intervals 
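
One possible way to evaluate the bounds in Table 2 for a small set of intervals is sketched below. The solution strategy is an assumption of this example and not necessarily the method of Zaman et al (2009): because the variance is a convex function of the points, its maximum over the box is attained at a vertex, so the upper bound is found by enumerating endpoint combinations (practical only for small n), while the lower bound is found by cyclic coordinate descent. At least two intervals are assumed, and all names are illustrative.

```python
from itertools import product

def pvariance(xs):
    """Second central moment with divisor n, as in Table 2."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def multiple_interval_moment_bounds(intervals, sweeps=200):
    """Bounds on mean and variance for a list of (lb, ub) intervals (n >= 2)."""
    n = len(intervals)
    mean_bounds = (sum(lo for lo, _ in intervals) / n,
                   sum(hi for _, hi in intervals) / n)
    # Upper bound: enumerate all 2^n endpoint combinations (vertices of the box)
    var_upper = max(pvariance(v) for v in product(*intervals))
    # Lower bound: for fixed other points, the best x_i is the mean of the
    # others, clipped to [lb_i, ub_i]; repeat the sweep until it settles
    x = [(lo + hi) / 2.0 for lo, hi in intervals]
    for _ in range(sweeps):
        for i, (lo, hi) in enumerate(intervals):
            others = (sum(x) - x[i]) / (n - 1)
            x[i] = min(max(others, lo), hi)
    return mean_bounds, (pvariance(x), var_upper)

disjoint = multiple_interval_moment_bounds([(1.0, 2.0), (3.0, 4.0)])
overlapping = multiple_interval_moment_bounds([(1.0, 3.0), (2.0, 4.0)])
print(disjoint, overlapping)
```

For the disjoint intervals the variance is bounded by [0.25, 2.25]; for the overlapping pair the lower bound drops to zero because both points can coincide.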


Once the bounds on the mean and variance of interval data are estimated by the 
methods described above, we can now use these bounds to solve the formulations of 
robustness-based design optimization under uncertainty represented through single 
interval or multiple interval data. In the following section, we illustrate our proposed 
formulations for robustness-based design optimization with both sparse point and interval 
data. 


3. Example Problem 

In this section, the proposed methods are illustrated for the conceptual level 
design process of a TSTO vehicle. The multidisciplinary system analysis consists of 
geometric modeling, aerodynamics, aerothermodynamics, engine performance analysis, 
trajectory analysis, mass property analysis and cost modeling (Stevenson et al, 2002). In 
this paper, a simplified version of the upper stage design process of a TSTO vehicle is 
used to illustrate the proposed methods. High fidelity codes of individual disciplinary 
analysis are replaced by inexpensive surrogate models. Figure 1 illustrates the analysis 
process of a TSTO vehicle. 

[Figure: diagram of the TSTO analysis process; recoverable labels: geometric inputs, aero inputs]

Figure 1: TSTO vehicle concept

The analysis outputs (performance functions) are Gross Weight (GW), Engine 
Weight (EW), Propellant Fraction Required (PFR), Vehicle Length (VL), Vehicle Volume 
(VV) and Body Wetted Area (BWA). Each of the analysis outputs is approximated by a 
second-order response surface and is a function of the random design variables Nozzle 
Expansion Ratio (ExpRatio), Payload Weight (Payload), Separation Mach (SepMach), 
Separation Dynamic Pressure (SepQ), Separation Flight Path Angle (SepAngle), and 
Body Fineness Ratio (Fineness). Each of the random variables is described by either
sparse point data or interval data.

The objective is to optimize an individual analysis output (e.g., Gross Weight) 
while satisfying the constraints imposed by each of the design variables as well as all the 
analysis outputs. We note here that we have assumed independence among the uncertain 
input variables and thereby ignored the covariance terms in Eq. (5) to estimate the 
variance of the performance function in each of the following examples. The numerical 
values of the design bounds for the design variables and analysis outputs are given in 
Tables 3 and 4, respectively. 
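To make the independence assumption concrete, the sketch below propagates input means and variances through a response surface while dropping the covariance terms. It assumes that Eq. (5) is the usual first-order Taylor series approximation of the variance, which is not restated in this section. The surface `toy_surface` and its coefficients are hypothetical stand-ins, not the paper's TSTO response surfaces.

```python
def first_order_moments(f, means, variances, rel_step=1e-6):
    """Approximate E[f(X)] and Var[f(X)] for independent inputs.

    Uses a first-order Taylor expansion about the input means:
    Var[f] ~ sum_i (df/dx_i)^2 * Var[x_i], with covariance terms omitted.
    """
    mean_f = f(means)
    var_f = 0.0
    for i, (m, v) in enumerate(zip(means, variances)):
        h = rel_step * max(abs(m), 1.0)
        up = list(means)
        dn = list(means)
        up[i] = m + h
        dn[i] = m - h
        grad_i = (f(up) - f(dn)) / (2.0 * h)  # central difference
        var_f += grad_i ** 2 * v
    return mean_f, var_f


def toy_surface(x):
    # Hypothetical second-order response surface in two inputs.
    a, b = x
    return 2.0 + 3.0 * a - 2.0 * b + 0.5 * a * b + 0.25 * a ** 2


mean_f, var_f = first_order_moments(toy_surface, [2.0, 4.0], [0.04, 0.09])
print(mean_f, var_f)
```

At the point (2, 4) the gradient of the toy surface is (6, -1), so the approximate variance is 36 * 0.04 + 1 * 0.09 = 1.53.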

Table 3: Design bounds for the design variables

Design variable   lb      ub
ExpRatio          40      150
Payload           8000    40000
SepMach           7       12
SepQ              40      200
SepAngle          7       12
Fineness          4       6


Table 4: Design bounds for the analysis outputs

Analysis output   LB      UB
GW                0       100e+005
EW                0       100e+005
PFR               0.4     0.95
VL                0       100e+002
VV                0       100e+003
BWA               0       100e+005


3.1. Robustness-based design optimization with sparse point data 

The methodology proposed in Section 2.1.1 is illustrated here for the TSTO 
problem. It is assumed that all the input variables x are described by sparse point data as 
given in Table 5. For this example, the input variable SepQ is assumed to be a non-design
epistemic variable and all the remaining variables are assumed to be design variables. 
The design bounds for the respective design variables and the analysis outputs are given 
in Tables 3 and 4. 


Table 5: Sparse point data for the random design variables

Sample   ExpRatio   Payload       SepMach   SepQ     SepAngle   Fineness
01       85.23      2.8952e+004   10.85     115.38   9.12       4.07
02       82.25      2.9747e+004   10.56     111.63   9.49       4.02
03       88.79      2.6638e+004   10.93     118.57   9.85       4.47
04       83.93      2.8356e+004   10.70     111.60   9.87       4.15
05       80.67      2.7193e+004   10.58     100.34   9.27       4.15
06       91.32      2.9168e+004   10.82     102.42   9.21       4.17
07       83.64      2.8844e+004   10.88     117.25   9.57       4.23
08       86.64      2.5836e+004   10.99     109.69   9.64       4.32
09       90.32      2.9310e+004   10.00     116.90   9.42       4.01
10       85.39      2.9949e+004   10.87     104.19   9.21       4.42


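The design problem becomes:

$$
\begin{aligned}
d^* = \arg\min_{d} \; & \big( w \, E(GW) + (1 - w) \, \sigma(GW) \big) \\
\text{s.t.} \quad & LB_1 + k\sigma(GW) \le E(GW) \le UB_1 - k\sigma(GW) \\
& LB_2 + k\sigma(EW) \le E(EW) \le UB_2 - k\sigma(EW) \\
& LB_3 + k\sigma(PFR) \le E(PFR) \le UB_3 - k\sigma(PFR) \\
& LB_4 + k\sigma(VL) \le E(VL) \le UB_4 - k\sigma(VL) \\
& LB_5 + k\sigma(VV) \le E(VV) \le UB_5 - k\sigma(VV) \\
& LB_6 + k\sigma(BWA) \le E(BWA) \le UB_6 - k\sigma(BWA) \\
& lb_i + k\sigma(x_i) \le d_i \le ub_i - k\sigma(x_i) \quad \text{for } i = 1, 2, \ldots, 5
\end{aligned}
\tag{13}
$$

$$
\begin{aligned}
\mu_z^* = \arg\max_{\mu_z} \; & \big( w \, E(GW) + (1 - w) \, \sigma(GW) \big) \\
\text{s.t.} \quad & Z_l \le \mu_{z_i} \le Z_u \quad \text{for } i = 1
\end{aligned}
\tag{14}
$$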

where the bounds $Z_l$ and $Z_u$ for the mean of the non-design epistemic variable SepQ are
calculated by Eq. (10) as given in Section 2.1.1. Note that in Eq. (14), we do not use the
robust design constraints, since the objective function in this case is a function of all
non-design epistemic variables.

As mentioned earlier in Section 2, w ≥ 0 is the weight parameter that represents the
relative importance of the objectives and k is a constant that adjusts the robustness of the 
method against the level of conservatism of the solution. In this paper, k is assumed to be 
unity. 
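The roles of w and k can be illustrated with a short Python sketch. The helper names `weighted_objective` and `robust_feasible` are hypothetical. The PFR bounds (0.4 and 0.95) are taken from Table 4, while the mean and standard deviation values passed in are purely illustrative.

```python
def weighted_objective(mean, std, w):
    """Weighted-sum objective w * E(.) + (1 - w) * sigma(.)."""
    return w * mean + (1.0 - w) * std


def robust_feasible(mean, std, lower, upper, k=1.0):
    """Check the robust constraint lower + k*std <= mean <= upper - k*std."""
    return lower + k * std <= mean <= upper - k * std


# Illustrative PFR statistics checked against the Table 4 bounds [0.4, 0.95].
print(robust_feasible(0.90, 0.03, 0.4, 0.95))         # k = 1
print(robust_feasible(0.93, 0.03, 0.4, 0.95))         # too close to the upper bound
print(robust_feasible(0.90, 0.03, 0.4, 0.95, k=2.0))  # larger k shrinks the feasible band
```

A larger k narrows the feasible band for the mean by k standard deviations on each side, which is why k trades robustness against conservatism.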

Variances of the random variables x and z are estimated as single point values.
Confidence intervals for the variances are estimated for each random variable described
by the sparse point data. The weight parameter w is varied (from 0 to 1) and the
optimization problems in Eqs. (13) and (14) are solved iteratively until convergence with the
Matlab solver 'fmincon' for different sample sizes (n) of the sparse point data. In each
case, the optimization problems converged in fewer than 5 iterations. Here, 'fmincon' uses
a sequential quadratic programming (SQP) algorithm. The estimate of the Hessian of the
Lagrangian is updated using the BFGS formula at each iteration. The convergence
properties of SQP have been discussed by many authors, including Fletcher (1987) and
Panier and Tits (1993).
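The alternating structure of the decoupled solution can be sketched on a toy problem. This is an illustration only: the one-dimensional `objective` is a hypothetical stand-in for the weighted sum of E(GW) and σ(GW), and a plain grid search replaces the SQP solver so that the example stays self-contained. The design step minimizes over d with the epistemic mean fixed, and the analysis step maximizes over the epistemic mean with d fixed, until neither changes.

```python
def objective(d, mu_z, w):
    # Hypothetical stand-ins for E(GW) and sigma(GW); not the paper's surfaces.
    mean_gw = (d - 3.0) ** 2 + 0.5 * d * mu_z + 10.0
    std_gw = 1.0 + 0.2 * abs(d) + 0.1 * mu_z
    return w * mean_gw + (1.0 - w) * std_gw


def grid(lo, hi, n=201):
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]


def decoupled_solve(w, d_bounds, z_bounds, max_iter=20):
    """Alternate a design step (min over d) and a worst-case step (max over mu_z)."""
    d_grid, z_grid = grid(*d_bounds), grid(*z_bounds)
    mu_z = 0.5 * (z_bounds[0] + z_bounds[1])  # start from the interval midpoint
    d = None
    for it in range(1, max_iter + 1):
        d_new = min(d_grid, key=lambda c: objective(c, mu_z, w))
        mu_new = max(z_grid, key=lambda c: objective(d_new, c, w))
        if d_new == d and mu_new == mu_z:
            return d, mu_z, it  # converged: neither step changed the iterate
        d, mu_z = d_new, mu_new
    return d, mu_z, max_iter


d_star, mu_star, n_iter = decoupled_solve(0.5, (0.0, 5.0), (1.0, 2.0))
print(d_star, mu_star, n_iter)
```

In this toy case the worst-case epistemic mean sits at the upper bound of its interval, and the loop stops after a few passes, mirroring the small iteration counts reported for the TSTO problem.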
The solutions are obtained by solving the problem using the upper confidence
bound for the variances of the random variables x and z. The solutions are presented in
Figure 2.



Figure 2: Robustness-based design optimization with sparse data for different sample sizes (n)


It is seen in Figure 2 that the solutions become more conservative (i.e., the mean 
and standard deviation of GW assume higher values) as we add uncertainty to the design 
problem. It is also seen from Figure 2 that as the sample size (n) increases, both the 
standard deviation and mean of GW decrease. As gathering more data reduces data
uncertainty, the solutions become less sensitive (i.e., the standard deviation of GW
assumes a lower value) to variations in the input random variables as the sample size
(n) increases. Likewise, as the uncertainty decreases with sample size, the optimum
mean of GW decreases.
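The effect of sample size on data uncertainty can be illustrated with the SepQ samples of Table 5. The sketch below uses the standard error of the mean, s / sqrt(n), as a simple proxy for the width of the bounds on the mean. It is not necessarily the expression used in Eq. (10), and it assumes that the sample standard deviation stays roughly constant as n grows.

```python
from math import sqrt
from statistics import mean, stdev

# SepQ samples from Table 5.
sep_q = [115.38, 111.63, 118.57, 111.60, 100.34,
         102.42, 117.25, 109.69, 116.90, 104.19]

sample_mean = mean(sep_q)
s = stdev(sep_q)  # sample standard deviation (divisor n - 1)

# Assumption: s is held fixed while the sample size n is varied.
standard_errors = {n: s / sqrt(n) for n in (10, 20, 30)}
print(sample_mean)
for n, se in standard_errors.items():
    print(n, round(se, 3))
```

Under this assumption, tripling the sample size from 10 to 30 shrinks the standard error by a factor of sqrt(3), which is consistent with the trend of decreasing uncertainty with sample size described above.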


3.2. Determination of optimal sample size for sparse point data

The optimal sample size formulations are illustrated here for the TSTO design
problem. The formulations are relaxed by assuming that standard deviations of the data 
do not change significantly as sample size changes. To make the problem simpler, we 
first relax the integer requirement on the optimal sample size n and then round off the 
solution for n to the nearest integer value. The input variable SepQ is assumed to be a 
non-design epistemic variable and all the remaining variables are assumed to be design 
variables. The design bounds for the respective design variables and the analysis outputs 
remain the same as in Tables 3 and 4. 

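The design problem therefore becomes:

$$
\begin{aligned}
[d^*, n_d^*] = \arg\min_{d,\, n_d} \; & \big( w \, E(GW) + v \, \sigma(GW) + (1 - w - v)(5 n_{d1} + 10 n_{d2} + 5 n_{d3} + 5 n_{d4} + 4 n_{d5} + 6 n_e^*) \big) \\
\text{s.t.} \quad & LB_1 + \sigma(GW) \le E(GW) \le UB_1 - \sigma(GW) \\
& LB_2 + \sigma(EW) \le E(EW) \le UB_2 - \sigma(EW) \\
& LB_3 + \sigma(PFR) \le E(PFR) \le UB_3 - \sigma(PFR) \\
& LB_4 + \sigma(VL) \le E(VL) \le UB_4 - \sigma(VL) \\
& LB_5 + \sigma(VV) \le E(VV) \le UB_5 - \sigma(VV) \\
& LB_6 + \sigma(BWA) \le E(BWA) \le UB_6 - \sigma(BWA) \\
& lb_i + k\sigma(x_i) \le d_i \le ub_i - k\sigma(x_i) \quad \text{for all } i = 1, 2, \ldots, 5 \\
& 5 n_{d1} + 10 n_{d2} + 5 n_{d3} + 5 n_{d4} + 4 n_{d5} + 6 n_e^* \le 1050 \\
& n_{dj} \le 30 \quad \text{for } j = 1, 2, \ldots, 5
\end{aligned}
\tag{15}
$$

$$
\begin{aligned}
[\mu_z^*, n_e] = \arg\max_{\mu_z,\, n_e} \; & \big( w \, E(GW) + v \, \sigma(GW) + (1 - w - v)(5 n_{d1}^* + 10 n_{d2}^* + 5 n_{d3}^* + 5 n_{d4}^* + 4 n_{d5}^* + 6 n_e) \big) \\
\text{s.t.} \quad & Z_l(n_e) \le \mu_z \le Z_u(n_e) \\
& 5 n_{d1}^* + 10 n_{d2}^* + 5 n_{d3}^* + 5 n_{d4}^* + 4 n_{d5}^* + 6 n_e \le 1050 \\
& n_{ej} \le 30 \quad \text{for } j = 1
\end{aligned}
\tag{16}
$$
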
We have solved this problem for different combinations of weights w and v and
the optimal solutions are presented in Table 6. In each case, the optimization problems
converged in fewer than 4 iterations.


Table 6: Objective function values at optimal solutions and optimal sample sizes

Weights              Objective function value                 Optimal sample sizes
w     v     1-w-v    Mean GW       Std GW        Total cost   n_d1  n_d2  n_d3  n_d4  n_d5  n_e
0     0     1        1.6118e+005   6.3732e+004   455.3008     5     10    15    8     9     30
0.6   0.2   0.2      1.4684e+005   5.3219e+004   539.8948     6     10    30    8     10    30
0.5   0.4   0.1      1.4878e+005   5.0526e+004   593.6961     7     10    30    14    15    30
0.5   0.5   0        1.5143e+005   4.7604e+004   886.9363     25    25    30    30    30    15


It is seen in Table 6 that the total cost incurred in obtaining samples is lowest
when the problem is solved with the greatest weight on the total cost (w = v = 0).
In this case, we get the most conservative robust design, i.e., the mean and the standard
deviation of GW assume the largest values among the cases considered. Note that the optimal
sample sizes are also the smallest in this case. As more weight is placed on
the mean and standard deviation of GW, the total cost and the optimal sample sizes
increase, while the mean and standard deviation of GW decrease relative to the cost-only case.
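The cost term and the sampling constraints can be checked against the rounded sample sizes of Table 6. The sketch below is a hedged illustration with hypothetical names (`UNIT_COST`, `total_cost`, `within_limits`). The unit costs, the budget of 1050 and the cap of 30 samples per variable are taken from the formulation in this section.

```python
# Unit sampling costs from the cost term of the formulation.
UNIT_COST = {"n_d1": 5, "n_d2": 10, "n_d3": 5, "n_d4": 5, "n_d5": 4, "n_e": 6}
BUDGET = 1050
MAX_SAMPLES = 30


def total_cost(sizes):
    return sum(UNIT_COST[name] * n for name, n in sizes.items())


def within_limits(sizes):
    return total_cost(sizes) <= BUDGET and all(n <= MAX_SAMPLES for n in sizes.values())


# Rounded optimal sample sizes, one dictionary per row of Table 6.
table6 = [
    {"n_d1": 5, "n_d2": 10, "n_d3": 15, "n_d4": 8, "n_d5": 9, "n_e": 30},
    {"n_d1": 6, "n_d2": 10, "n_d3": 30, "n_d4": 8, "n_d5": 10, "n_e": 30},
    {"n_d1": 7, "n_d2": 10, "n_d3": 30, "n_d4": 14, "n_d5": 15, "n_e": 30},
    {"n_d1": 25, "n_d2": 25, "n_d3": 30, "n_d4": 30, "n_d5": 30, "n_e": 15},
]

costs = [total_cost(row) for row in table6]
print(costs)                               # [456, 540, 595, 885]
print(all(within_limits(r) for r in table6))
```

The costs of the rounded sample sizes (456, 540, 595 and 885) differ slightly from the totals reported in Table 6, presumably because the reported totals correspond to the relaxed, non-integer solution before rounding.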

3.3. Robustness-based design optimization with sparse point and interval data

The methodology proposed in Section 2.1 is illustrated here for the same TSTO 
problem. Here, it is assumed that the design variable ExpRatio is described by sparse 
point data as given in Table 5, the design variable Payload is described by multiple 
interval data as given in Table 7 and the design variables SepMach and SepQ are 
described by single interval data as given in Table 8. The non-design epistemic variables 
SepAngle and Fineness are described by the sparse point data (as given in Table 5) and
the single interval data (as given in Table 7), respectively. The design bounds for the 
respective design variables and the analysis outputs remain the same as in Tables 3 and 4. 

Table 7: Multiple interval data for the random input variables

Payload   [25000, 28000], [26000, 29000], [25000, 29000], [26000, 30000], [25000, 30000]


Table 8: Single interval data for the random input variables

SepMach   [9, 10]
SepQ      [100, 120]


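The design problem is now formulated as follows:

$$
\begin{aligned}
d^* = \arg\min_{d} \; & \big( w \, E(GW) + (1 - w) \, \sigma(GW) \big) \\
\text{s.t.} \quad & LB_1 + k\sigma(GW) \le E(GW) \le UB_1 - k\sigma(GW) \\
& LB_2 + k\sigma(EW) \le E(EW) \le UB_2 - k\sigma(EW) \\
& LB_3 + k\sigma(PFR) \le E(PFR) \le UB_3 - k\sigma(PFR) \\
& LB_4 + k\sigma(VL) \le E(VL) \le UB_4 - k\sigma(VL) \\
& LB_5 + k\sigma(VV) \le E(VV) \le UB_5 - k\sigma(VV) \\
& LB_6 + k\sigma(BWA) \le E(BWA) \le UB_6 - k\sigma(BWA) \\
& lb_i + k\sigma(x_i) \le d_i \le ub_i - k\sigma(x_i) \quad \text{for } i = 1, 2, 3, 4
\end{aligned}
\tag{15}
$$

$$
\begin{aligned}
\mu_z^* = \arg\max_{\mu_z} \; & \big( w \, E(GW) + (1 - w) \, \sigma(GW) \big) \\
\text{s.t.} \quad & Z_l \le \mu_{z_i} \le Z_u \quad \text{for } i = 1, 2
\end{aligned}
\tag{16}
$$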

where the bounds $Z_l$ and $Z_u$ for the mean value of the non-design epistemic variable
SepAngle are calculated by Eq. (10) as given in Section 2.1.1 and those for the epistemic
variable Fineness are calculated by the method described in Section 2.1.3. Note that in
Eq. (16), we do not use the robust design constraints, since the objective function in this
case is a function of all non-design epistemic variables.

Variances of the random variables ExpRatio and SepAngle are estimated as single
point values. Confidence intervals for the variances are estimated for each random
variable described by sparse point data. Bounds on the variances of the random variables
SepMach, SepQ, Fineness, and Payload are estimated by the methods described in
Section 2.1.3. The weight parameter w is varied (from 0 to 1) and the optimization
problems in Eqs. (15) and (16) are solved iteratively until convergence. In each case, the
optimization problems converged in fewer than 5 iterations. The solutions are obtained by
solving the problems using the upper confidence bound on the sample variance for the
random variables ExpRatio and SepAngle, and the upper bound on the sample variance for
the random variables Payload, SepMach, SepQ and Fineness. The solutions are presented
in Figure 3.



Figure 3: Robustness-based design optimization with non-design epistemic variables 

Figure 3 shows the solutions of the conservative robust design in the presence of
uncontrollable epistemic uncertainty described through mixed data, i.e., both sparse point
data and interval data, a situation that arises frequently in engineering applications.

4. Summary and Conclusion 

This paper proposed several formulations for robustness-based design 
optimization under data uncertainty. Two types of data uncertainty - sparse point data 
and interval data - are considered. The proposed formulations are illustrated for the upper 
stage design problem of a TSTO space vehicle. A decoupled approach is proposed in this 
paper to un-nest the robustness-based design from the analysis of non-design epistemic 
variables to achieve computational efficiency. As gathering more data reduces 
uncertainty but increases cost, the effect of sample size on the optimality and the 
robustness of the solution is also studied. This is demonstrated by numerical examples, 
which suggest that as the uncertainty decreases with sample size, the resulting solutions 
become more robust. We have also proposed a formulation to determine the optimal
sample size for sparse point data that leads to the solution of the design problem that is
least sensitive (i.e., robust) to the variations of design variables. In this paper, we have
used the weighted sum approach for the aggregation of multiple objectives and to 
examine the trade-offs among multiple objectives. Other multi-objective optimization 
techniques can also be explored within the proposed formulations. 

The major advantage of the proposed methodology is that unlike existing 
methods, it does not use separate representations for aleatory and epistemic uncertainties 
and does not require nested analysis. Both types of uncertainty are treated in a unified 
manner using a probabilistic format, thus reducing the computational effort and
simplifying the optimization problem. The results regarding robustness of the design 
versus data size are valuable to the decision maker. The design optimization procedure 
also optimizes the sample size, thus facilitating resource allocation for data collection 
efforts. Due to the use of a probabilistic format to represent all the uncertain variables, 
the proposed robustness-based design optimization methodology facilitates the 
implementation of multidisciplinary robustness-based design optimization, which is a
challenging problem in the presence of epistemic uncertainty.